Search results for "Statistically validated networks"
showing 10 items of 13 documents
Insurance fraud detection: A statistically validated network approach
2022
Fraud is a social phenomenon, and fraudsters often collaborate with other fraudsters, taking on different roles. The challenge for insurance companies is to implement claim assessment and improve fraud detection accuracy. We developed an investigative system based on bipartite networks, highlighting the relationships between subjects and accidents or vehicles and accidents. We formalize filtering rules through probability models, test specific methods to assess the existence of communities in extensive networks, and propose new alert metrics for suspicious structures. We apply the methodology to a real database, the Italian Antifraud Integrated Archive, and compare the results to out-of-sam…
Statistically validated mobile communication networks: the evolution of motifs in European and Chinese data
2014
Big data open up unprecedented opportunities to investigate complex systems, including society. In particular, communication data serve as major sources for computational social sciences, but they have to be cleaned and filtered because they may contain spurious information due to recording errors as well as interactions, such as commercial and marketing activities, that are not directly related to the social network. The network constructed from communication data can only be considered a proxy for the network of social relationships. Here we apply a systematic method, based on multiple hypothesis testing, to statistically validate the links and then construct the corresponding Bonferroni network, gen…
Households and their Expenditures as an Evolving Complex Social System
2020
Dynamics of fintech terms in news and blogs and specialization of companies of the fintech industry
2020
We perform a large-scale analysis of a list of fintech terms in (i) news and blogs in the English language and (ii) professional descriptions of companies operating in many countries. The occurrence and co-occurrence of fintech terms and locutions show a progressive evolution of the list of fintech terms into a compact and coherent set of terms used worldwide to describe fintech business activities. By using methods of complex networks that are specifically designed to deal with heterogeneous systems, our analysis of a large set of professional descriptions of companies shows that companies having fintech terms in their description present over-expressions of specific attributes of country, muni…
Backbone of credit relationships in the Japanese credit market
2016
We detect the backbone of the weighted bipartite network of the Japanese credit market relationships. The backbone is detected by adapting a general method used in the investigation of weighted networks. With this approach we detect a backbone that is statistically validated against a null hypothesis of uniform diversification of loans for banks and firms. Our investigation is done year by year and covers more than thirty years, from 1980 to 2011. We relate some of our findings to economic events that have characterized the Japanese credit market in recent years. The study of the time evolution of the backbone allows us to detect changes that occurred in network size,…
Quantifying preferential trading in the e-MID interbank market
2015
Interbank markets allow credit institutions to exchange capital for purposes of liquidity management. These markets are among the most liquid markets in the financial system. However, liquidity of interbank markets dropped during the 2007-2008 financial crisis, and this lack of liquidity influenced the entire economic system. In this paper, we analyze transaction data from the e-MID market, which is the only electronic interbank market in the Euro Area and US, over a period of eleven years (1999-2009). We adapt a method developed to detect statistically validated links in a network in order to reveal preferential trading in a directed network. Preferential trading between banks is detecte…
Statistically Validated Networks for assessing topic quality in LDA models
2022
Probabilistic topic models have become one of the most widespread machine learning techniques for textual analysis. In this framework, Latent Dirichlet Allocation (LDA) (Blei et al., 2003) has gained more and more popularity as a text modelling technique. The idea is that documents are represented as random mixtures over latent topics, where a distribution over words characterizes each topic. Unfortunately, topic models do not guarantee the interpretability of their outputs. The topics learned from the model may be characterized only by a set of irrelevant or unchained words, making them useless for interpretation. Although many topic-quality metrics were proposed (Newman et al., 2009; Alet…
Statistically Validated Networks for evaluating coherence in topic models
2022
Probabilistic topic models have become one of the most widespread machine learning techniques for textual analysis. In this framework, Latent Dirichlet Allocation (LDA) has gained more and more popularity as a text modelling technique. The idea is that documents are represented as random mixtures over latent topics, where a distribution over words characterizes each topic. Unfortunately, topic models do not guarantee the interpretability of their outputs. The topics learned from the model may be characterized by a set of irrelevant or unchained words, making them useless for interpretation. In the framework of topic quality evaluation, the pairwise semantic cohesion among the top-N most pr…
MEASURING TOPIC COHERENCE THROUGH STATISTICALLY VALIDATED NETWORKS
2020
Topic models arise from the need to understand and explore large text document collections and to predict their underlying structure. Latent Dirichlet Allocation (LDA) (Blei et al., 2003) has quickly become one of the most popular text modelling techniques. The idea is that documents are represented as random mixtures over latent topics, where a distribution over words characterizes each topic. Unfortunately, topic models give no guarantee of the interpretability of their outputs. The topics learned from texts may be characterized by a set of irrelevant or unchained words. Therefore, topic models require validation of the coherence of estimated topics. However, the automatic evaluation …
STRANIERI, MERIDIONALI O PROVINCIALI? I CONSUMI NEL TEMPO LIBERO DELLE SECONDE GENERAZIONI [FOREIGNERS, SOUTHERNERS OR PROVINCIALS? LEISURE-TIME CONSUMPTION OF THE SECOND GENERATIONS]
2022
In this paper, we analyze leisure-time consumption patterns among young people belonging to the so-called “second generation” of immigrants in Italy. Leisure-time consumption describes how young immigrants use cultural products and services. We analyze data collected by ISTAT through the survey on the “second generations” (2015). A comparison of leisure consumption patterns between second-generation immigrants and their Italian peers does not show significant differences. Rather, differences in consumption styles are associated with gender (male/female), geographic area of residence (North/South), and size of the municipality of residence (large/small).